deep architecture
Neural Diffusion Distance for Image Segmentation
The network is a differentiable deep architecture consisting of feature extraction and diffusion distance modules for computing diffusion distance on image by end-to-end training. We design low resolution kernel matching loss and high resolution segment matching loss to enforce the network's output to beconsistent withhuman-labeled image segments.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > China (0.04)
Supplementary Materials for " DropCov: A Simple yet Effective Method for Improving Deep Architectures " Qilong Wang
Our proposed DropCov can be flexibly integrated with existing deep architectures (e.g., CNNs [ Qinghua Hu is the corresponding author and is with Engineering Research Center of City intelligence and Digital Governance, Ministry of Education of the People's Republic of China. VGG-VD on three small-scale fine-grained datasets) show 0.5 is the best choices of As listed in Table S2, we can see that single L T module brings a little gain for plain GCP . Compared to B-CNN + L T (79.62% training accuracy), plain GCP GCP + L T, while B-CNN + L T achieves significant improvement over B-CNN and plain GCP . On the contrary, the samples involving less redundant information (e.g., scene) have large Such these phenomena show the consistency with our finding. Is second-order information helpful for large-scale visual recognition?
- Asia > China > Tianjin Province > Tianjin (0.05)
- Asia > China > Liaoning Province > Dalian (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
DropCov: A Simple yet Effective Method for Improving Deep Architectures
Previous works show global covariance pooling (GCP) has great potential to improve deep architectures especially on visual recognition tasks, where post-normalization of GCP plays a very important role in final performance. Although several post-normalization strategies have been studied, these methods pay more close attention to effect of normalization on covariance representations rather than the whole GCP networks, and their effectiveness requires further understanding. Meanwhile, existing effective post-normalization strategies (e.g., matrix power normalization) usually suffer from high computational complexity (e.g., $O(d^{3})$ for $d$-dimensional inputs). To handle above issues, this work first analyzes the effect of post-normalization from the perspective of training GCP networks. Particularly, we for the first time show that \textit{effective post-normalization can make a good trade-off between representation decorrelation and information preservation for GCP, which are crucial to alleviate over-fitting and increase representation ability of deep GCP networks, respectively}. Based on this finding, we can improve existing post-normalization methods with some small modifications, providing further support to our observation. Furthermore, this finding encourages us to propose a novel pre-normalization method for GCP (namely DropCov), which develops an adaptive channel dropout on features right before GCP, aiming to reach trade-off between representation decorrelation and information preservation in a more efficient way. Our DropCov only has a linear complexity of $O(d)$, while being free for inference. Extensive experiments on various benchmarks (i.e., ImageNet-1K, ImageNet-C, ImageNet-A, Stylized-ImageNet, and iNat2017) show our DropCov is superior to the counterparts in terms of efficiency and effectiveness, and provides a simple yet effective method to improve performance of deep architectures involving both deep convolutional neural networks (CNNs) and vision transformers (ViTs).
Understanding Deep Architecture with Reasoning Layer
Recently, there is a surge of interest in combining deep learning models with reasoning in order to handle more sophisticated learning tasks. In many cases, a reasoning task can be solved by an iterative algorithm. This algorithm is often unrolled, truncated, and used as a specialized layer in the deep architecture, which can be trained end-to-end with other neural components. Although such hybrid deep architectures have led to many empirical successes, theoretical understandings of such architectures, especially the interplay between algorithm layers and other neural layers, remains largely unexplored. In this paper, we take an initial step toward an understanding of such hybrid deep architectures by showing that properties of the algorithm layers, such as convergence, stability and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model. Furthermore, our analysis matches nicely with experimental observations under various conditions, suggesting that our theory can provide useful guidelines for designing deep architectures with reasoning layers.
Controlling Steering Angle for Cooperative Self-driving Vehicles utilizing CNN and LSTM-based Deep Networks
Valiente, Rodolfo, Zaman, Mahdi, Ozer, Sedat, Fallah, Yaser P.
A fundamental challenge in autonomous vehicles is adjusting the steering angle at different road conditions. Recent state-of-the-art solutions addressing this challenge include deep learning techniques as they provide end-to-end solution to predict steering angles directly from the raw input images with higher accuracy. Most of these works ignore the temporal dependencies between the image frames. In this paper, we tackle the problem of utilizing multiple sets of images shared between two autonomous vehicles to improve the accuracy of controlling the steering angle by considering the temporal dependencies between the image frames. This problem has not been studied in the literature widely. We present and study a new deep architecture to predict the steering angle automatically by using Long-Short-Term-Memory (LSTM) in our deep architecture. Our deep architecture is an end-to-end network that utilizes CNN, LSTM and fully connected (FC) layers and it uses both present and futures images (shared by a vehicle ahead via Vehicle-to-Vehicle (V2V) communication) as input to control the steering angle. Our model demonstrates the lowest error when compared to the other existing approaches in the literature.
- North America > United States > Florida > Orange County > Orlando (0.14)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Transportation > Ground > Road (0.68)
- Automobiles & Trucks (0.68)
- Information Technology (0.47)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Pennsylvania (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Deep ADMM-Net for Compressive Sensing MRI
yan yang, Jian Sun, Huibin Li, Zongben Xu
Compressive Sensing (CS) is an effective approach for fast Magnetic Resonance Imaging (MRI). It aims at reconstructing MR image from a small number of under-sampled data in k -space, and accelerating the data acquisition in MRI. To improve the current MRI system in reconstruction accuracy and computational speed, in this paper, we propose a novel deep architecture, dubbed ADMM-Net. ADMM-Net is defined over a data flow graph, which is derived from the iterative procedures in Alternating Direction Method of Multipliers (ADMM) algorithm for optimizing a CS-based MRI model. In the training phase, all parameters of the net, e.g., image transforms, shrinkage functions, etc., are discriminatively trained end-to-end using L-BFGS algorithm. In the testing phase, it has computational overhead similar to ADMM but uses optimized parameters learned from the training data for CS-based reconstruction task. Experiments on MRI image reconstruction under different sampling ratios in k -space demonstrate that it significantly improves the baseline ADMM algorithm and achieves high reconstruction accuracies with fast computational speed.
- Asia > China > Shaanxi Province > Xi'an (0.05)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Deep Poisson Factor Modeling
Ricardo Henao, Zhe Gan, James Lu, Lawrence Carin
We propose a new deep architecture for topic modeling, based on Poisson Factor Analysis (PFA) modules. The model is composed of a Poisso n distribution to model observed vectors of counts, as well as a deep hierarchy of hidden binary units. Rather than using logistic functions to characteriz e the probability that a latent binary unit is on, we employ a Bernoulli-Poisson link, which allows PFA modules to be used repeatedly in the deep architecture. We al so describe an approach to build discriminative topic models, by adapting PF A modules. We derive efficient inference via MCMC and stochastic variational met hods, that scale with the number of non-zeros in the data and binary units, yieldin g significant efficiency, relative to models based on logistic links. Experim ents on several corpora demonstrate the advantages of our model when compared to rel ated deep models.
- Asia > Middle East > Jordan (0.05)
- North America > United States > North Carolina > Durham County > Durham (0.04)